<郭伊軒><2025-10-30>Main Findings and Takeaways:
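Around non-seasonal discount dates, the changes in player count, follower count, and positive-review rate tend to fall before the discount and rise after it.
Games that release a DLC are more likely to be discounted than games that release a sequel.
Future Direction:
Compare against competitor activity and overall market conditions: the share of same-genre games on discount and the average price drop over the same period.
Compare across game genres or publishers: analyze how discount strategies differ by genre or publisher.
Combine machine-learning models for discount prediction: for example, use a Random Forest or XGBoost model to predict discount timing, in order to validate variable importance and improve practical value.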
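The last item above could be prototyped roughly as follows. This is a minimal sketch on synthetic data, not the project's model: the feature columns reuse names from this dataset, but the values, the label-generating rule, and the hyperparameters are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real panel (assumption: values are made up;
# only the column names mirror the dataset used in this notebook)
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    'PlayerGrowthRate1W': rng.normal(0, 0.3, n),
    'FollowersGrowthRate1W': rng.normal(0.0015, 0.0015, n),
    'DLC_sum_1W': rng.integers(0, 2, n),
    'SalePeriod': rng.integers(0, 2, n),
})
# Hypothetical label rule: discounts only happen during sale periods, half of the time
y = ((X['SalePeriod'] == 1) & (rng.random(n) < 0.5)).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances: one value per feature, normalized to sum to 1
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```

On the real panel, a chronological train/test split would probably be safer than a random one, since rows from the same game on adjacent days are strongly correlated.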
# Load packages here
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import timedelta
from sklearn.preprocessing import StandardScaler
input_data_file = "/Users/10610/Desktop/114-1 資料/steam-project/discount-timing-DE.csv"
df = pd.read_csv(input_data_file)
df['Date'] = pd.to_datetime(df['Date'])
game_id = df['GameID'].unique()  # all distinct GameID values
game_dfs = []  # one DataFrame per game
for gid in game_id:
    sub_df = df[df['GameID'] == gid]  # rows belonging to this GameID
    game_dfs.append(sub_df)
df.head()
| | Date | GameID | Type | MultiPlayer | Publisher | ConstantDiscount | DiscountOrNot | DiscountDuration | DiscountFreq3M | Age | ... | FollowersGrowthRate1M | PositiveRateGrowthRate1W | PositiveRateGrowthRate2W | PositiveRateGrowthRate1M | DLC_sum_1W | DLC_sum_2W | DLC_sum_1M | Sequel_sum_1W | Sequel_sum_2W | Sequel_sum_1M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2023-05-01 | 10 | Action | 1 | Valve | 0 | 0 | 0 | 1 | 22.509589 | ... | 0.003889 | 0.000012 | -0.000014 | 1.098178e-05 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 2023-05-02 | 10 | Action | 1 | Valve | 0 | 0 | 0 | 1 | 22.512329 | ... | 0.003913 | -0.000010 | -0.000036 | -4.698912e-07 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2023-05-03 | 10 | Action | 1 | Valve | 0 | 0 | 0 | 1 | 22.515068 | ... | 0.003979 | -0.000011 | -0.000041 | -8.082766e-07 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 2023-05-04 | 10 | Action | 1 | Valve | 0 | 0 | 0 | 1 | 22.517808 | ... | 0.004101 | -0.000012 | -0.000050 | -2.450820e-05 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 2023-05-05 | 10 | Action | 1 | Valve | 0 | 0 | 0 | 1 | 22.520548 | ... | 0.003912 | -0.000023 | -0.000053 | -3.754777e-05 | 0 | 0 | 0 | 0 | 0 | 0 |
5 rows × 29 columns
df.describe().T
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| GameID | 23938.0 | 461376.742000 | 298559.181056 | 10.000000 | 244850.000000 | 431730.000000 | 644930.000000 | 1.145360e+06 |
| MultiPlayer | 23938.0 | 0.464241 | 0.498730 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 1.000000e+00 |
| ConstantDiscount | 23938.0 | 0.214387 | 0.410405 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000e+00 |
| DiscountOrNot | 23938.0 | 0.019885 | 0.139607 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000e+00 |
| DiscountDuration | 23938.0 | 0.221196 | 1.715483 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 3.200000e+01 |
| DiscountFreq3M | 23938.0 | 1.797644 | 1.043279 | 0.000000 | 1.000000 | 2.000000 | 3.000000 | 6.000000e+00 |
| Age | 23938.0 | 7.634427 | 4.458471 | 2.389041 | 4.951370 | 6.323288 | 8.479452 | 2.484658e+01 |
| AccumulatedPositiveRate | 23938.0 | 0.928061 | 0.064186 | 0.738751 | 0.905517 | 0.953165 | 0.972651 | 9.929734e-01 |
| SalePeriod | 23938.0 | 0.146420 | 0.353534 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000e+00 |
| DiscountDuringSale | 23938.0 | 0.008647 | 0.092590 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000e+00 |
| DiscountOutOfSale | 23938.0 | 0.011237 | 0.105411 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000e+00 |
| PlayerGrowthRate1W | 23938.0 | 0.020047 | 0.288771 | -0.592919 | -0.063356 | -0.014085 | 0.035173 | 1.131120e+01 |
| PlayerGrowthRate2W | 23938.0 | 0.032978 | 0.370759 | -0.726683 | -0.088489 | -0.013852 | 0.060643 | 1.088149e+01 |
| PlayerGrowthRate1M | 23938.0 | 0.039727 | 0.395094 | -0.768049 | -0.108811 | -0.009196 | 0.090659 | 7.285318e+00 |
| FollowersGrowthRate1W | 23938.0 | 0.001576 | 0.001482 | -0.000137 | 0.000640 | 0.001111 | 0.002078 | 2.296777e-02 |
| FollowersGrowthRate2W | 23938.0 | 0.003159 | 0.002758 | -0.000176 | 0.001343 | 0.002289 | 0.004157 | 3.623771e-02 |
| FollowersGrowthRate1M | 23938.0 | 0.006816 | 0.005480 | 0.000085 | 0.003142 | 0.005071 | 0.009104 | 5.282514e-02 |
| PositiveRateGrowthRate1W | 23938.0 | 0.000017 | 0.000332 | -0.015190 | -0.000025 | 0.000005 | 0.000044 | 6.059725e-03 |
| PositiveRateGrowthRate2W | 23938.0 | 0.000033 | 0.000569 | -0.016712 | -0.000041 | 0.000009 | 0.000077 | 9.389808e-03 |
| PositiveRateGrowthRate1M | 23938.0 | 0.000070 | 0.001033 | -0.017811 | -0.000078 | 0.000015 | 0.000151 | 1.392160e-02 |
| DLC_sum_1W | 23938.0 | 0.004679 | 0.071822 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 2.000000e+00 |
| DLC_sum_2W | 23938.0 | 0.009358 | 0.101767 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 2.000000e+00 |
| DLC_sum_1M | 23938.0 | 0.020553 | 0.150182 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 2.000000e+00 |
| Sequel_sum_1W | 23938.0 | 0.001170 | 0.034181 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000e+00 |
| Sequel_sum_2W | 23938.0 | 0.002339 | 0.048312 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000e+00 |
| Sequel_sum_1M | 23938.0 | 0.005013 | 0.070626 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000e+00 |
Make the graphs, summary statistics, and regression model below. Make sure you have followed the guidelines specified in 專案資料夾結構、檔案命名與文件規範 (Project Folder Structure, File Naming, and Documentation Guidelines).
for index, sub_df in enumerate(game_dfs):
    # Days on which the 1-week rolling DLC / sequel count rises by one,
    # i.e. the day a new release enters the window
    dlc_first_day = sub_df[
        (sub_df['DLC_sum_1W'] > 0) &
        (sub_df['DLC_sum_1W'].shift(1) == sub_df['DLC_sum_1W'] - 1)
    ]
    sequel_first_day = sub_df[
        (sub_df['Sequel_sum_1W'] > 0) &
        (sub_df['Sequel_sum_1W'].shift(1) == sub_df['Sequel_sum_1W'] - 1)
    ]
    fig, axes = plt.subplots(3, 1, figsize=(15, 9), sharex=True)
    # ---- Figure title and shared y-axis label ----
    gid = sub_df['GameID'].iloc[0]  # each sub_df holds a single GameID
    fig.suptitle(f'Game {gid} Time Series', fontsize=14, y=0.95)
    fig.text(0.04, 0.5, 'Rate', va='center', rotation='vertical', fontsize=12)
    # ---- One panel per metric ----
    panels = [
        ('PlayerGrowthRate1W', 'Players', 'blue'),
        ('FollowersGrowthRate1W', 'Followers', 'orange'),
        ('PositiveRateGrowthRate1W', 'Positive Rate', 'red'),
    ]
    for ax, (col, name, line_color) in zip(axes, panels):
        ax.plot(sub_df['Date'], sub_df[col], label=name, color=line_color)
        ax.set_title(col)
        ax.scatter(dlc_first_day['Date'], dlc_first_day[col],
                   color='gray', s=40, marker='o', label='DLC Release', zorder=5)
        ax.scatter(sequel_first_day['Date'], sequel_first_day[col],
                   color='purple', s=40, marker='o', label='Sequel Release', zorder=5)
    # ---- Shade discount periods ----
    sale_labels = {'Seasonal Sale': False, 'Non-Seasonal Sale': False}
    for _, row in sub_df.iterrows():
        if row['DiscountOrNot'] == 1:
            start_date = row['Date']
            end_date = row['Date'] + timedelta(days=int(row['DiscountDuration']))
            # Seasonal vs. non-seasonal discount
            if row['SalePeriod'] == 1:
                color = 'red'
                label = 'Seasonal Sale'
            else:
                color = 'blue'
                label = 'Non-Seasonal Sale'
            # Label only the first span of each kind to avoid duplicate legend entries
            for ax in axes:
                ax.axvspan(start_date, end_date, color=color, alpha=0.15,
                           label=label if not sale_labels[label] else None)
            sale_labels[label] = True
    # ---- Shared settings ----
    for ax in axes:
        ax.legend()
        ax.grid(True, linestyle='--', alpha=0.3)
    plt.xlabel('Date')
    plt.xticks(rotation=45)
    plt.tight_layout(rect=[0.05, 0.05, 1, 0.95])  # leave room for the title and y label
    plt.show()
    if index == 5:  # plot only the first six games
        break
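The release markers in the plots above depend on spotting the day a rolling release count goes up. Below is a minimal sketch of that step as a standalone helper. `release_days` and the toy data are hypothetical, and `diff() > 0` is a variant that also catches jumps larger than one, which the `shift(1)` comparison in the cell above does not.

```python
import pandas as pd

def release_days(sub_df, col):
    """Return the rows where the rolling release count in `col` rises versus the previous row."""
    rose = sub_df[col].diff() > 0  # the first row has no predecessor, so it is never flagged
    return sub_df[rose]

# Toy single-game series (made-up values): one DLC enters the window on the third day,
# a second one on the fifth day, and both later roll out of the window again
toy = pd.DataFrame({
    'Date': pd.date_range('2023-05-01', periods=8, freq='D'),
    'DLC_sum_1W': [0, 0, 1, 1, 2, 2, 1, 0],
})
first_days = release_days(toy, 'DLC_sum_1W')
print(first_days)
```

Like the original filter, this has to be applied to one game at a time. On the full `df`, the difference would otherwise be taken across the boundary between two games.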
window = 7
event_data = []
for _, row in df[df['DiscountOrNot'] == 1].iterrows():
    date = row['Date']
    # Keep only the same game's rows within ±window days of the discount start
    event_window = df[(df['GameID'] == row['GameID']) &
                      (df['Date'] >= date - pd.Timedelta(days=window)) &
                      (df['Date'] <= date + pd.Timedelta(days=window))].copy()
    event_window['DaysFromEvent'] = (event_window['Date'] - date).dt.days
    event_data.append(event_window)
event_df = pd.concat(event_data)
avg_change = event_df.groupby('DaysFromEvent')[['PlayerGrowthRate1W', 'FollowersGrowthRate1W', 'PositiveRateGrowthRate1W']].mean()
scaler = StandardScaler()
avg_change_scaled = avg_change.copy()
avg_change_scaled[:] = scaler.fit_transform(avg_change)
avg_change_scaled.plot(title='Standardized Change Around Discount Start', figsize=(10,5))
plt.axvline(0, color='red', linestyle='--', label='Discount Start')
plt.legend()
plt.show()
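The cell above standardizes the three averaged series before plotting. Their raw scales differ by orders of magnitude in the summary table, so they would not be readable on one axis otherwise. `StandardScaler` subtracts each column's mean and divides by its population standard deviation (`ddof=0`). A small pandas-only sketch of the same arithmetic, on made-up numbers:

```python
import numpy as np
import pandas as pd

# Toy columns (made-up values) on very different scales
toy = pd.DataFrame({'a': [1.0, 2.0, 3.0], 'b': [10.0, 20.0, 60.0]})

# z-score per column: (x - mean) / population std, which is what StandardScaler computes
z = (toy - toy.mean()) / toy.std(ddof=0)
print(z)
```

Because each column is rescaled separately, the resulting plot shows the shape of each series around the event, not their relative magnitudes.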
window = 7
event_data = []
for _, row in df[df['DiscountDuringSale'] == 1].iterrows():
    date = row['Date']
    # Keep only the same game's rows within ±window days of the discount start
    event_window = df[(df['GameID'] == row['GameID']) &
                      (df['Date'] >= date - pd.Timedelta(days=window)) &
                      (df['Date'] <= date + pd.Timedelta(days=window))].copy()
    event_window['DaysFromEvent'] = (event_window['Date'] - date).dt.days
    event_data.append(event_window)
event_df = pd.concat(event_data)
avg_change = event_df.groupby('DaysFromEvent')[['PlayerGrowthRate1W', 'FollowersGrowthRate1W', 'PositiveRateGrowthRate1W']].mean()
scaler = StandardScaler()
avg_change_scaled = avg_change.copy()
avg_change_scaled[:] = scaler.fit_transform(avg_change)
avg_change_scaled.plot(title='Standardized Change Around Seasonal Discount Start', figsize=(10,5))
plt.axvline(0, color='red', linestyle='--', label='Discount Start')
plt.legend()
plt.show()
window = 7
event_data = []
for _, row in df[df['DiscountOutOfSale'] == 1].iterrows():
    date = row['Date']
    # Keep only the same game's rows within ±window days of the discount start
    event_window = df[(df['GameID'] == row['GameID']) &
                      (df['Date'] >= date - pd.Timedelta(days=window)) &
                      (df['Date'] <= date + pd.Timedelta(days=window))].copy()
    event_window['DaysFromEvent'] = (event_window['Date'] - date).dt.days
    event_data.append(event_window)
event_df = pd.concat(event_data)
avg_change = event_df.groupby('DaysFromEvent')[['PlayerGrowthRate1W', 'FollowersGrowthRate1W', 'PositiveRateGrowthRate1W']].mean()
scaler = StandardScaler()
avg_change_scaled = avg_change.copy()
avg_change_scaled[:] = scaler.fit_transform(avg_change)
avg_change_scaled.plot(title='Standardized Change Around Non-seasonal Discount Start', figsize=(10,5))
plt.axvline(0, color='red', linestyle='--', label='Discount Start')
plt.legend()
plt.show()
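The three cells above repeat the same event-window logic for different discount flags. Below is a sketch of how that logic could be factored into one helper. `event_study` is a hypothetical name, the toy panel is made up, and restricting each window to the event's own `GameID` is an assumption about the intended design.

```python
import pandas as pd

def event_study(df, flag_col, metrics, window=7):
    """Mean of each metric by day offset around rows where `flag_col` == 1,
    using only rows from the same game as the event."""
    frames = []
    events = df.loc[df[flag_col] == 1, ['GameID', 'Date']]
    for ev in events.itertuples(index=False):
        offset = (df['Date'] - ev.Date).dt.days
        mask = (df['GameID'] == ev.GameID) & offset.between(-window, window)
        frames.append(df.loc[mask, metrics].assign(DaysFromEvent=offset[mask]))
    if not frames:
        return pd.DataFrame(columns=metrics)
    return pd.concat(frames).groupby('DaysFromEvent')[metrics].mean()

# Toy panel (made-up values): two games, 30 days each, one discount in game 1
dates = pd.date_range('2023-05-01', periods=30, freq='D')
toy = pd.concat([
    pd.DataFrame({'GameID': 1, 'Date': dates, 'PlayerGrowthRate1W': 1.0, 'DiscountOrNot': 0}),
    pd.DataFrame({'GameID': 2, 'Date': dates, 'PlayerGrowthRate1W': 100.0, 'DiscountOrNot': 0}),
], ignore_index=True)
toy.loc[(toy['GameID'] == 1) & (toy['Date'] == dates[15]), 'DiscountOrNot'] = 1

result = event_study(toy, 'DiscountOrNot', ['PlayerGrowthRate1W'])
print(result)
```

In the toy panel, game 2's values never enter the average because the window is limited to game 1. The same call with `'DiscountDuringSale'` or `'DiscountOutOfSale'` as `flag_col` would cover the seasonal and non-seasonal cases.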
for index, sub_df in enumerate(game_dfs):
    # Days on which the 1-week rolling DLC / sequel count rises by one,
    # i.e. the day a new release enters the window (same rule as the 1W plots)
    dlc_first_day = sub_df[
        (sub_df['DLC_sum_1W'] > 0) &
        (sub_df['DLC_sum_1W'].shift(1) == sub_df['DLC_sum_1W'] - 1)
    ]
    sequel_first_day = sub_df[
        (sub_df['Sequel_sum_1W'] > 0) &
        (sub_df['Sequel_sum_1W'].shift(1) == sub_df['Sequel_sum_1W'] - 1)
    ]
    fig, axes = plt.subplots(3, 1, figsize=(15, 9), sharex=True)
    # ---- Figure title and shared y-axis label ----
    gid = sub_df['GameID'].iloc[0]  # each sub_df holds a single GameID
    fig.suptitle(f'Game {gid} Time Series', fontsize=14, y=0.95)
    fig.text(0.04, 0.5, 'Rate', va='center', rotation='vertical', fontsize=12)
    # ---- One panel per metric ----
    panels = [
        ('PlayerGrowthRate2W', 'Players', 'blue'),
        ('FollowersGrowthRate2W', 'Followers', 'orange'),
        ('PositiveRateGrowthRate2W', 'Positive Rate', 'red'),
    ]
    for ax, (col, name, line_color) in zip(axes, panels):
        ax.plot(sub_df['Date'], sub_df[col], label=name, color=line_color)
        ax.set_title(col)
        ax.scatter(dlc_first_day['Date'], dlc_first_day[col],
                   color='gray', s=40, marker='o', label='DLC Release', zorder=5)
        ax.scatter(sequel_first_day['Date'], sequel_first_day[col],
                   color='purple', s=40, marker='o', label='Sequel Release', zorder=5)
    # ---- Shade discount periods ----
    sale_labels = {'Seasonal Sale': False, 'Non-Seasonal Sale': False}
    for _, row in sub_df.iterrows():
        if row['DiscountOrNot'] == 1:
            start_date = row['Date']
            end_date = row['Date'] + timedelta(days=int(row['DiscountDuration']))
            # Seasonal vs. non-seasonal discount
            if row['SalePeriod'] == 1:
                color = 'red'
                label = 'Seasonal Sale'
            else:
                color = 'blue'
                label = 'Non-Seasonal Sale'
            # Label only the first span of each kind to avoid duplicate legend entries
            for ax in axes:
                ax.axvspan(start_date, end_date, color=color, alpha=0.15,
                           label=label if not sale_labels[label] else None)
            sale_labels[label] = True
    # ---- Shared settings ----
    for ax in axes:
        ax.legend()
        ax.grid(True, linestyle='--', alpha=0.3)
    plt.xlabel('Date')
    plt.xticks(rotation=45)
    plt.tight_layout(rect=[0.05, 0.05, 1, 0.95])  # leave room for the title and y label
    plt.show()
    if index == 5:  # plot only the first six games
        break
window = 7
event_data = []
for _, row in df[df['DiscountOrNot'] == 1].iterrows():
    date = row['Date']
    # Keep only the same game's rows within ±window days of the discount start
    event_window = df[(df['GameID'] == row['GameID']) &
                      (df['Date'] >= date - pd.Timedelta(days=window)) &
                      (df['Date'] <= date + pd.Timedelta(days=window))].copy()
    event_window['DaysFromEvent'] = (event_window['Date'] - date).dt.days
    event_data.append(event_window)
event_df = pd.concat(event_data)
avg_change = event_df.groupby('DaysFromEvent')[['PlayerGrowthRate2W', 'FollowersGrowthRate2W', 'PositiveRateGrowthRate2W']].mean()
scaler = StandardScaler()
avg_change_scaled = avg_change.copy()
avg_change_scaled[:] = scaler.fit_transform(avg_change)
avg_change_scaled.plot(title='Standardized Change Around Discount Start', figsize=(10,5))
plt.axvline(0, color='red', linestyle='--', label='Discount Start')
plt.legend()
plt.show()
window = 7
event_data = []
for _, row in df[df['DiscountDuringSale'] == 1].iterrows():
    date = row['Date']
    # Keep only the same game's rows within ±window days of the discount start
    event_window = df[(df['GameID'] == row['GameID']) &
                      (df['Date'] >= date - pd.Timedelta(days=window)) &
                      (df['Date'] <= date + pd.Timedelta(days=window))].copy()
    event_window['DaysFromEvent'] = (event_window['Date'] - date).dt.days
    event_data.append(event_window)
event_df = pd.concat(event_data)
avg_change = event_df.groupby('DaysFromEvent')[['PlayerGrowthRate2W', 'FollowersGrowthRate2W', 'PositiveRateGrowthRate2W']].mean()
scaler = StandardScaler()
avg_change_scaled = avg_change.copy()
avg_change_scaled[:] = scaler.fit_transform(avg_change)
avg_change_scaled.plot(title='Standardized Change Around Seasonal Discount Start', figsize=(10,5))
plt.axvline(0, color='red', linestyle='--', label='Discount Start')
plt.legend()
plt.show()
window = 7
event_data = []
for _, row in df[df['DiscountOutOfSale'] == 1].iterrows():
    date = row['Date']
    # Keep only the same game's rows within ±window days of the discount start
    event_window = df[(df['GameID'] == row['GameID']) &
                      (df['Date'] >= date - pd.Timedelta(days=window)) &
                      (df['Date'] <= date + pd.Timedelta(days=window))].copy()
    event_window['DaysFromEvent'] = (event_window['Date'] - date).dt.days
    event_data.append(event_window)
event_df = pd.concat(event_data)
avg_change = event_df.groupby('DaysFromEvent')[['PlayerGrowthRate2W', 'FollowersGrowthRate2W', 'PositiveRateGrowthRate2W']].mean()
scaler = StandardScaler()
avg_change_scaled = avg_change.copy()
avg_change_scaled[:] = scaler.fit_transform(avg_change)
avg_change_scaled.plot(title='Standardized Change Around Non-seasonal Discount Start', figsize=(10,5))
plt.axvline(0, color='red', linestyle='--', label='Discount Start')
plt.legend()
plt.show()
for index, sub_df in enumerate(game_dfs):
    # Days on which the 1-week rolling DLC / sequel count rises by one,
    # i.e. the day a new release enters the window (same rule as the 1W plots)
    dlc_first_day = sub_df[
        (sub_df['DLC_sum_1W'] > 0) &
        (sub_df['DLC_sum_1W'].shift(1) == sub_df['DLC_sum_1W'] - 1)
    ]
    sequel_first_day = sub_df[
        (sub_df['Sequel_sum_1W'] > 0) &
        (sub_df['Sequel_sum_1W'].shift(1) == sub_df['Sequel_sum_1W'] - 1)
    ]
    fig, axes = plt.subplots(3, 1, figsize=(15, 9), sharex=True)
    # ---- Figure title and shared y-axis label ----
    gid = sub_df['GameID'].iloc[0]  # each sub_df holds a single GameID
    fig.suptitle(f'Game {gid} Time Series', fontsize=14, y=0.95)
    fig.text(0.04, 0.5, 'Rate', va='center', rotation='vertical', fontsize=12)
    # ---- One panel per metric ----
    panels = [
        ('PlayerGrowthRate1M', 'Players', 'blue'),
        ('FollowersGrowthRate1M', 'Followers', 'orange'),
        ('PositiveRateGrowthRate1M', 'Positive Rate', 'red'),
    ]
    for ax, (col, name, line_color) in zip(axes, panels):
        ax.plot(sub_df['Date'], sub_df[col], label=name, color=line_color)
        ax.set_title(col)
        ax.scatter(dlc_first_day['Date'], dlc_first_day[col],
                   color='gray', s=40, marker='o', label='DLC Release', zorder=5)
        ax.scatter(sequel_first_day['Date'], sequel_first_day[col],
                   color='purple', s=40, marker='o', label='Sequel Release', zorder=5)
    # ---- Shade discount periods ----
    sale_labels = {'Seasonal Sale': False, 'Non-Seasonal Sale': False}
    for _, row in sub_df.iterrows():
        if row['DiscountOrNot'] == 1:
            start_date = row['Date']
            end_date = row['Date'] + timedelta(days=int(row['DiscountDuration']))
            # Seasonal vs. non-seasonal discount
            if row['SalePeriod'] == 1:
                color = 'red'
                label = 'Seasonal Sale'
            else:
                color = 'blue'
                label = 'Non-Seasonal Sale'
            # Label only the first span of each kind to avoid duplicate legend entries
            for ax in axes:
                ax.axvspan(start_date, end_date, color=color, alpha=0.15,
                           label=label if not sale_labels[label] else None)
            sale_labels[label] = True
    # ---- Shared settings ----
    for ax in axes:
        ax.legend()
        ax.grid(True, linestyle='--', alpha=0.3)
    plt.xlabel('Date')
    plt.xticks(rotation=45)
    plt.tight_layout(rect=[0.05, 0.05, 1, 0.95])  # leave room for the title and y label
    plt.show()
    if index == 5:  # plot only the first six games
        break
window = 7
event_data = []
for _, row in df[df['DiscountOrNot'] == 1].iterrows():
    date = row['Date']
    # Keep only the same game's rows within ±window days of the discount start
    event_window = df[(df['GameID'] == row['GameID']) &
                      (df['Date'] >= date - pd.Timedelta(days=window)) &
                      (df['Date'] <= date + pd.Timedelta(days=window))].copy()
    event_window['DaysFromEvent'] = (event_window['Date'] - date).dt.days
    event_data.append(event_window)
event_df = pd.concat(event_data)
avg_change = event_df.groupby('DaysFromEvent')[['PlayerGrowthRate1M', 'FollowersGrowthRate1M', 'PositiveRateGrowthRate1M']].mean()
scaler = StandardScaler()
avg_change_scaled = avg_change.copy()
avg_change_scaled[:] = scaler.fit_transform(avg_change)
avg_change_scaled.plot(title='Standardized Change Around Discount Start', figsize=(10,5))
plt.axvline(0, color='red', linestyle='--', label='Discount Start')
plt.legend()
plt.show()
window = 7
event_data = []
for _, row in df[df['DiscountDuringSale'] == 1].iterrows():
    date = row['Date']
    # Keep only the same game's rows within ±window days of the discount start
    event_window = df[(df['GameID'] == row['GameID']) &
                      (df['Date'] >= date - pd.Timedelta(days=window)) &
                      (df['Date'] <= date + pd.Timedelta(days=window))].copy()
    event_window['DaysFromEvent'] = (event_window['Date'] - date).dt.days
    event_data.append(event_window)
event_df = pd.concat(event_data)
avg_change = event_df.groupby('DaysFromEvent')[['PlayerGrowthRate1M', 'FollowersGrowthRate1M', 'PositiveRateGrowthRate1M']].mean()
scaler = StandardScaler()
avg_change_scaled = avg_change.copy()
avg_change_scaled[:] = scaler.fit_transform(avg_change)
avg_change_scaled.plot(title='Standardized Change Around Seasonal Discount Start', figsize=(10,5))
plt.axvline(0, color='red', linestyle='--', label='Discount Start')
plt.legend()
plt.show()
window = 7
event_data = []
for _, row in df[df['DiscountOutOfSale'] == 1].iterrows():
    date = row['Date']
    # Keep only the same game's rows within ±window days of the discount start
    event_window = df[(df['GameID'] == row['GameID']) &
                      (df['Date'] >= date - pd.Timedelta(days=window)) &
                      (df['Date'] <= date + pd.Timedelta(days=window))].copy()
    event_window['DaysFromEvent'] = (event_window['Date'] - date).dt.days
    event_data.append(event_window)
event_df = pd.concat(event_data)
avg_change = event_df.groupby('DaysFromEvent')[['PlayerGrowthRate1M', 'FollowersGrowthRate1M', 'PositiveRateGrowthRate1M']].mean()
scaler = StandardScaler()
avg_change_scaled = avg_change.copy()
avg_change_scaled[:] = scaler.fit_transform(avg_change)
avg_change_scaled.plot(title='Standardized Change Around Non-seasonal Discount Start', figsize=(10,5))
plt.axvline(0, color='red', linestyle='--', label='Discount Start')
plt.legend()
plt.show()
heatmap_data = df.groupby(['DLC_sum_1W', 'Sequel_sum_1W'])['DiscountOrNot'].mean().unstack()
sns.heatmap(heatmap_data, cmap='coolwarm', annot=False)
plt.title('Discount Probability by DLC and Sequel Releases in 1 Week')
plt.xlabel('Number of sequel releases')
plt.ylabel('Number of DLC releases')
plt.show()
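Within the time window we examined, none of these games released a DLC and a sequel at the same time. The discount rate following a DLC release is higher than the rate following a sequel release.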
len(df[df['DLC_sum_1W'] == 2])
6
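Because only 6 rows have `DLC_sum_1W == 2`, some heatmap cells rest on very few observations. Below is a sketch of tabulating the discount rate together with the row count behind each cell. The toy frame is made up, and on the real data `toy` would be replaced by `df`.

```python
import pandas as pd

# Toy data (made-up values) with the same column names as the notebook's df
toy = pd.DataFrame({
    'DLC_sum_1W':    [0, 0, 0, 0, 1, 1, 2, 0],
    'Sequel_sum_1W': [0, 0, 0, 0, 0, 0, 0, 1],
    'DiscountOrNot': [0, 0, 0, 1, 1, 0, 1, 0],
})

# Discount rate and number of rows behind each (DLC, sequel) cell
cell_stats = (
    toy.groupby(['DLC_sum_1W', 'Sequel_sum_1W'])['DiscountOrNot']
       .agg(rate='mean', n='count')
)
print(cell_stats)
```

Cells with a small `n` could then be masked or flagged before plotting, so that a rate computed from a handful of rows is not read as a stable probability.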
heatmap_data = df.groupby(['DLC_sum_2W', 'Sequel_sum_2W'])['DiscountOrNot'].mean().unstack()
sns.heatmap(heatmap_data, cmap='coolwarm', annot=False)
plt.title('Discount Probability by DLC and Sequel Releases in 2 Weeks')
plt.xlabel('Number of sequel releases')
plt.ylabel('Number of DLC releases')
plt.show()
heatmap_data = df.groupby(['DLC_sum_1M', 'Sequel_sum_1M'])['DiscountOrNot'].mean().unstack()
sns.heatmap(heatmap_data, cmap='coolwarm', annot=False)
plt.title('Discount Probability by DLC and Sequel Releases in 1 Month')
plt.xlabel('Number of sequel releases')
plt.ylabel('Number of DLC releases')
plt.show()
heatmap_data = df.groupby(['MultiPlayer', 'DiscountFreq3M'])['DiscountOrNot'].mean().unstack()
sns.heatmap(heatmap_data, cmap='coolwarm', annot=False)
plt.title('Discount Probability by MultiPlayer and Discount Frequency in 3 Months')
plt.xlabel('Discount Frequency (3 Months)')
plt.ylabel('MultiPlayer (0 = Single, 1 = Multi)')
plt.show()